Logical Markov Decision Programs and the Convergence of Logical TD(lambda)
نویسندگان
چکیده
Recent developments in the area of relational reinforcement learning (RRL) have resulted in a number of new algorithms. A theory, however, that explains why RRL works, seems to be lacking. In this paper, we provide some initial results on a theory of RRL. To realize this, we introduce a novel representation formalism, called logical Markov decision programs (LOMDPs), that integrates Markov Decision Processes (MDPs) with Logic Programs. Using LOMDPs one can compactly and declaratively represent complex MDPs. Within this framework we then devise a relational upgrade of TD(λ) called logical TD(λ) and prove convergence. Experiments validate our approach.
منابع مشابه
Logical Markov Decision Programs
Motivated by the interest in relational reinforcement learning, we introduce a novel representation formalism, called logical Markov decision programs (LOMDPs), that integrates Markov Decision Processes with Logic Programs. Using LOMDPs one can compactly and declaratively represent complex relational Markov decision processes. Within this framework we then develop a theory of reinforcement lear...
متن کامل"Say EM" for Selecting Probabilistic Models for Logical Sequences
Many real world sequences such as protein secondary structures or shell logs exhibit a rich internal structures. Traditional probabilistic models of sequences, however, consider sequences of flat symbols only. Logical hidden Markov models have been proposed as one solution. They deal with logical sequences, i.e., sequences over an alphabet of logical atoms. This comes at the expense of a more c...
متن کاملRelational Linear Programs
We propose relational linear programming, a simple framework for combing linear programs (LPs) and logic programs. A relational linear program (RLP) is a declarative LP template defining the objective and the constraints through the logical concepts of objects, relations, and quantified variables. This allows one to express the LP objective and constraints relationally for a varying number of i...
متن کاملOn Isomorphism of Dependent Products in a Typed Logical Framework
A complete decision procedure for isomorphism of kinds that contain only dependent product, constant Type and variables is obtained. All proofs are done using Z. Luo’s typed logical framework. They can be easily transferred to a large class of type theories with dependent product. 1998 ACM Subject Classification F.4.1 Lambda calculus and related systems, D.2.13 Reuse models, D.3.3 Frameworks
متن کاملO / 0 61 21 06 v 1 21 D ec 2 00 6 On completeness of logical relations for monadic types ⋆
Software security can be ensured by specifying and verifying security properties of software using formal methods with strong theoretical bases. In particular, programs can be modeled in the framework of lambda-calculi, and interesting properties can be expressed formally by contextual equivalence (a.k.a. observational equivalence). Furthermore, imperative features, which exist in most real-lif...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2004